Developing Managed User-Defined Types (UDTs)
In the preceding section, you used a managed user-defined type (UDT) called RegexPattern to store the regular expression pattern. In this section, you explore how custom UDTs are built and used in SQL Server.
The first thing to note is that although managed UDTs share their name with the extended data types built using SQL Server 2000, the two are by no means the same in SQL Server 2008. SQL Server 2000’s UDTs were actually retro-named “alias data types” in SQL Server 2005. SQL Server 2008 UDTs are structs (value types) built using the .NET Framework.
To create a UDT of your own, you right-click your Visual Studio project and then select Add, User-Defined Type. Next, you should name both the class and its autogenerated method RegexPattern. Notice the attribute used to decorate the RegexPattern struct: SqlUserDefinedType. Its constructor has the following parameters:
Format— Tells SQL Server how serialization (and its complement, deserialization) of the struct should be done. You specify Format.Native to let SQL Server handle serialization for you. You specify Format.UserDefined to do your own serialization.
When Format.UserDefined is specified, the struct must implement the IBinarySerialize interface to explicitly take the values from string (or int, or whatever the value passed into the constructor of the type is) back to binary and vice versa.
A named parameter list— This list contains the following:
IsFixedLength—Tells SQL Server that the byte count of the struct is the same for all its instances.
IsByteOrdered—Tells SQL Server that the bytes of the struct are ordered so that it may be used in binary comparisons, as with ORDER BY, GROUP BY, or PARTITION BY clauses, in indexing, and when the UDT is a primary or foreign key.
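MaxByteSize—Tells SQL Server not to allow more than the specified number of bytes to be held in an instance of the UDT. An explicit value can be at most 8,000 bytes; in SQL Server 2008, you can also specify -1 to allow larger instances (up to 2GB). You must specify this when using Format.UserDefined.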
Name—Tells the deployment routine what to call the UDT when it is created in the database.
ValidationMethodName—Tells SQL Server which method of the struct to use to validate it when it has been deserialized (in certain cases).
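To make the Format.UserDefined option more concrete, the following is a minimal sketch of the kind of Read/Write pair that IBinarySerialize asks for, exercised against an in-memory stream. The PatternPayload struct and the SerializationDemo class are hypothetical names used only for illustration. The sketch deliberately avoids the SQL Server assemblies so that it can run as an ordinary console program, and it assumes the state being stored is a single string.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a UDT's state; not part of any listing in this section.
public struct PatternPayload
{
    public string Text;

    // Same shape as IBinarySerialize.Write: push the state into the writer.
    public void Write(BinaryWriter w)
    {
        w.Write(Text); // written as a length-prefixed string
    }

    // Same shape as IBinarySerialize.Read: rebuild the state from the reader.
    public void Read(BinaryReader r)
    {
        Text = r.ReadString();
    }
}

public static class SerializationDemo
{
    public static void Main()
    {
        PatternPayload original = new PatternPayload();
        original.Text = @"\d+";

        using (MemoryStream ms = new MemoryStream())
        {
            BinaryWriter w = new BinaryWriter(ms);
            original.Write(w);
            w.Flush();

            // The serialized size is what MaxByteSize has to accommodate.
            Console.WriteLine("Serialized bytes: " + ms.Length);

            // Rewind and read the value back into a fresh instance.
            ms.Position = 0;
            PatternPayload copy = new PatternPayload();
            copy.Read(new BinaryReader(ms));
            Console.WriteLine("Round trip ok: " + (copy.Text == original.Text));
        }
    }
}
```

Because BinaryWriter prefixes the string with its byte length, the serialized form is always a little larger than the text itself, which is worth keeping in mind when choosing a MaxByteSize value.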
The implementation contract for any UDT is as follows:
It must provide a static method called Parse(), used by SQL Server for conversion to the struct from a string.
It must provide an instance method that overrides the default ToString() method for converting from the struct to a string.
It must implement the INullable interface, providing a Boolean instance property called IsNull, used by SQL Server to determine whether an instance is null.
It must have a static property called Null of the type of the struct. This property returns an instance of the struct whose value is null (that is, where IsNull is true for that instance). (This concept seems to be derived from the “null object” design pattern.)
Also, you need to be aware that UDTs can have only read-only static fields, they cannot use inheritance, and they cannot have overloaded methods (except the constructor, whose overloads are mainly used when ADO.NET is the calling context).
Given these fairly stringent requirements, Listing 6 provides an implementation of a UDT representing a regular expression pattern.
Listing 6. A UDT Representing a Regular Expression Pattern
using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
//added
using System.Text.RegularExpressions;

[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedType(
    Format.UserDefined, // requires IBinarySerialize
    IsFixedLength = false,
    IsByteOrdered = true,
    MaxByteSize = 250,
    ValidationMethodName = "RegexPatternValidator"
)]
public struct RegexPattern : INullable, IBinarySerialize
{
    //instance data fields
    private Regex _reg;
    private bool _null;

    //constructor
    public RegexPattern(String Pattern)
    {
        _reg = new Regex(Pattern);
        _null = (Pattern == String.Empty);
    }

    //instance method
    public override string ToString()
    {
        return _reg.ToString();
    }

    //instance property
    public bool IsNull
    {
        get
        {
            if (_reg == null || _reg.ToString() == string.Empty)
            {
                return true;
            }
            else
                return false;
        }
    }

    //static property
    public static RegexPattern Null
    {
        get
        {
            RegexPattern NullInstance = new RegexPattern();
            NullInstance._null = true;
            return NullInstance;
        }
    }

    //static method
    public static RegexPattern Parse(SqlString Pattern)
    {
        if (Pattern.IsNull)
            return Null;
        else
        {
            RegexPattern u = new RegexPattern((String)Pattern);
            return u;
        }
    }

    //private instance method
    private bool RegexPatternValidator()
    {
        return (_reg.ToString() != string.Empty);
    }

    //instance method
    public Int32 Match(String Input)
    {
        // match the raw input; Regex.Escape is meant for patterns, not input text
        Match m = _reg.Match(Input);
        if (m != null)
            return Convert.ToInt32(m.Success);
        else
            return 0;
    }

    //instance property
    public bool IsFullStringMatch
    {
        get
        {
            // true when the pattern text itself is anchored with ^ and $
            Match m = Regex.Match(_reg.ToString(), @"\^.+\$");
            if (m != null)
                return m.Success;
            else
                return false;
        }
    }

    //instance method
    [SqlMethod(
        DataAccess = DataAccessKind.None,
        IsMutator = false,
        IsPrecise = true,
        OnNullCall = false,
        SystemDataAccess = SystemDataAccessKind.None
    )]
    public Int32 MatchingGroupCount(SqlString Input)
    {
        // match the raw input; Regex.Escape is meant for patterns, not input text
        Match m = _reg.Match(Input.ToString());
        if (m != null)
            return m.Groups.Count;
        else
            return 0;
    }

    //static method
    [SqlMethod(
        DataAccess = DataAccessKind.None,
        IsMutator = false,
        IsPrecise = true,
        OnNullCall = false,
        SystemDataAccess = SystemDataAccessKind.None
    )]
    public static bool UsesLookaheads(RegexPattern p)
    // must be static to be called with :: syntax
    {
        Match m = Regex.Match(p.ToString(), @
        if (m != null)
            return m.Success;
        else
            return false;
    }

    #region IBinarySerialize Members
    public void Read(System.IO.BinaryReader r)
    {
        _reg = new Regex(r.ReadString());
    }

    public void Write(System.IO.BinaryWriter w)
    {
        w.Write(_reg.ToString());
    }
    #endregion
}
As you can see by scanning this code, it meets the required implementation contract. In addition, it declares static and instance methods, as well as instance properties. Both static and instance methods can optionally be decorated with the SqlMethod attribute. By default, methods of UDTs are declared to be nondeterministic and nonmutator, meaning that they do not change the value of the instance.
You use the named parameters of the constructor for SqlMethod to override this and other behaviors. These are its named parameters:
DataAccess— Tells SQL Server whether the method will access user table data on the server in its body. If you provide the enum value DataAccessKind.None, some optimizations may be made.
SystemDataAccess— Tells SQL Server whether the method will access system table data on the server in its body. Again, if you provide the enum value SystemDataAccessKind.None, some optimizations may be made.
IsDeterministic— Tells SQL Server whether the method always returns the same values, given the same input parameters.
IsMutator— Must be set to true if the method changes the state of the instance.
Name— Tells the deployment routine what to call the method when it is registered in the database.
OnNullCall— Indicates whether the method is invoked when any of its arguments are null. When set to false, SQL Server returns null without calling the method.
InvokeIfReceiverIsNull— Indicates whether to invoke the method if the instance of the struct itself is null.
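Listing 6 only uses SqlMethod on read-only methods, so a mutator is sketched below for contrast. PatternText, Length, and Append are hypothetical names chosen for illustration and are not part of the RegexPattern type. The sketch assumes a .NET Framework class library project that references System.Data, and it has not been deployed to a server.

```csharp
using System;
using System.Data.SqlTypes;
using System.IO;
using Microsoft.SqlServer.Server;

// Hypothetical UDT used only to illustrate SqlMethod settings.
[Serializable]
[SqlUserDefinedType(Format.UserDefined, MaxByteSize = 250)]
public struct PatternText : INullable, IBinarySerialize
{
    private string _text; // a null reference stands for SQL NULL

    public bool IsNull { get { return _text == null; } }

    public static PatternText Null { get { return new PatternText(); } }

    public static PatternText Parse(SqlString s)
    {
        if (s.IsNull) return Null;
        PatternText p = new PatternText();
        p._text = s.Value;
        return p;
    }

    public override string ToString() { return _text ?? string.Empty; }

    // Read-only and repeatable, so it is declared deterministic.
    [SqlMethod(IsDeterministic = true, IsPrecise = true, OnNullCall = false)]
    public SqlInt32 Length()
    {
        return _text == null ? SqlInt32.Null : new SqlInt32(_text.Length);
    }

    // Changes the state of the instance, so IsMutator must be true.
    // Mutator methods return void.
    [SqlMethod(IsMutator = true, OnNullCall = false)]
    public void Append(SqlString suffix)
    {
        _text = _text + suffix.Value;
    }

    public void Read(BinaryReader r) { _text = r.ReadString(); }

    public void Write(BinaryWriter w) { w.Write(_text ?? string.Empty); }
}
```

In T-SQL, a mutator such as Append would be invoked through a SET statement on a variable (SET @p.Append('x')) or in the SET clause of an UPDATE, rather than from within a SELECT list.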
To create this type in SQL Server without using Visual Studio, you use the CREATE TYPE DDL syntax, as follows:
CREATE TYPE RegexPattern EXTERNAL NAME SQLCLR.RegexPattern
Note that DROP TYPE TypeName is also available, but there is no ALTER TYPE statement.
Let us add a few words on the code in Listing 6. The constructor to RegexPattern validates the expression passed to it via the constructor of System.Text.RegularExpressions.Regex.
If you assign an invalid regex to a variable of type RegexPattern in a T-SQL SET statement, or insert or update an invalid value in a table column of that type, the Regex class does its usual pattern validation, just as it does in the .NET world, and the statement fails.
Let’s look at some of the ways you can use your UDT. The following example shows how to call all the public members (both static and instance) of RegexPattern:
DECLARE @rp RegexPattern
SET @rp = '(\w+)\s+?(?!bar)'
SELECT
@rp.ToString() AS ToString,
@rp.IsFullStringMatch AS FullStringMatch,
@rp.Match('uncle freddie') AS Match,
@rp.MatchingGroupCount('loves elken') AS GroupCount,
RegexPattern::UsesLookaheads(@rp) AS UsesLH
go
ToString          FullStringMatch  Match  GroupCount  UsesLH
------------------------------------------------------------
(\w+)\s+?(?!bar)  0                1      2           1

(1 row(s) affected)
Note that static members can be called (without an instance, that is) by using the following new syntax:
TypeName::MemberName(OptionalParameters)
To try the UDT as a column data type, you can create a table and populate it as shown here:
CREATE TABLE dbo.RegexTest
(
PatternId int IDENTITY(1,1),
Pattern RegexPattern
)
GO
INSERT RegexTest SELECT '\d+'
INSERT RegexTest SELECT 'foo (?:bar)'
INSERT RegexTest SELECT '(\s+()'
Msg 6522, Level 16, State 2, Line 215
A .NET Framework error occurred during execution of user defined
routine or aggregate
'RegexPattern':
System.ArgumentException: parsing "(\s+()" - Not enough )'s.
System.ArgumentException:
at System.Text.RegularExpressions.RegexParser.ScanRegex()
at System.Text.RegularExpressions.RegexParser.Parse(String re,
RegexOptions op)
at System.Text.RegularExpressions.Regex..ctor(String pattern,
RegexOptions options,
Boolean useCache)
at System.Text.RegularExpressions.Regex..ctor(String pattern)
at RegexPattern..ctor(String Pattern)
at RegexPattern.Parse(SqlString Pattern)
Do you see what happens when you try to insert an invalid regex pattern into the Pattern column (the third insert statement)? The parenthesis count is off, and the CLR tells you so in the query window’s output.
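The same validation can be reproduced outside SQL Server. The following is a small console sketch, assuming nothing beyond the .NET base class library; PatternCheck and IsValidPattern are hypothetical names, and the three candidate patterns are the ones inserted above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternCheck
{
    // Returns true when the Regex constructor accepts the pattern.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Regex throws ArgumentException (or a subclass) for malformed patterns.
            return false;
        }
    }

    public static void Main()
    {
        string[] candidates = { @"\d+", "foo (?:bar)", @"(\s+()" };
        foreach (string candidate in candidates)
        {
            // The first two patterns are accepted; the third is rejected.
            Console.WriteLine(candidate + " -> " + IsValidPattern(candidate));
        }
    }
}
```

A check like this in client code is one way to catch a malformed pattern before it ever reaches the INSERT statement, although the UDT still validates on its own.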
Because the UDT has the IsByteOrdered named parameter set to true, you can index this column (based on the struct’s serialized value) and use it in ORDER BY clauses. Here’s an example:
CREATE NONCLUSTERED INDEX PatternIndex ON dbo.RegexTest(Pattern)
GO
SELECT
Pattern.ToString() AS PatString,
RegexPattern::UsesLookaheads(Pattern) AS UsesLookaheads
FROM RegexTest
ORDER BY Pattern
go
PatString    UsesLookaheads
---------------------------
\d+          0
foo (?:bar)  1

(2 row(s) affected)
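Because the ordering is based on the serialized value rather than on the pattern text, it can help to look at the bytes themselves. The sketch below serializes two patterns the same way the Write method in Listing 6 does (BinaryWriter.Write on a string) and compares the results byte by byte. ByteOrderDemo, Serialize, and CompareBytes are hypothetical names, and the comparison routine is only an approximation of a binary comparison, not a description of SQL Server's internals.

```csharp
using System;
using System.IO;

public static class ByteOrderDemo
{
    // Serializes a pattern the way Listing 6's Write method does.
    public static byte[] Serialize(string pattern)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            BinaryWriter w = new BinaryWriter(ms);
            w.Write(pattern); // length prefix followed by the UTF-8 bytes
            w.Flush();
            return ms.ToArray();
        }
    }

    // Plain left-to-right byte comparison; if one array is a prefix
    // of the other, the shorter one sorts first.
    public static int CompareBytes(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i]) return a[i] < b[i] ? -1 : 1;
        }
        return a.Length.CompareTo(b.Length);
    }

    public static void Main()
    {
        byte[] shortPattern = Serialize(@"\d+");
        byte[] longPattern = Serialize("foo (?:bar)");

        Console.WriteLine(BitConverter.ToString(shortPattern)); // 03-5C-64-2B
        Console.WriteLine(CompareBytes(shortPattern, longPattern)); // -1
    }
}
```

The first byte of each array is the length prefix, so under a comparison like this one, shorter patterns sort ahead of longer ones regardless of their text. If alphabetical ordering matters, the serialization format would need to be designed with that in mind.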
Back in ADO.NET, you can access the UDT by using the new SqlDbType.Udt enum value. To try this, you can add a new C# Windows application to your sample solution. You can add a project reference to your sample project ("SQLCLR") and then add using statements for System.Data.SqlClient and System.Configuration (the listing reads its connection string through ConfigurationManager, which also requires a reference to the System.Configuration assembly). Then you should add a list box called lbRegexes to the form. Finally, you should add a button called btnCallUDT to the form, double-click it, and add the code in Listing 7 to the body of its Click event handler.
Listing 7. Using a UDT from ADO.NET in a Client Application
private void btnCallUDT_Click(object sender, EventArgs e)
{
    using (SqlConnection c =
        new SqlConnection(ConfigurationManager.AppSettings["connstring"]))
    {
        using (SqlCommand s =
            new SqlCommand("SELECT Pattern FROM dbo.RegexTest", c))
        {
            c.Open();
            // disposing the reader closes it (and, with CloseConnection, the connection)
            using (SqlDataReader r =
                s.ExecuteReader(CommandBehavior.CloseConnection))
            {
                while (r.Read())
                {
                    // each column value comes back as the UDT struct itself
                    RegexPattern p = (RegexPattern)r.GetValue(0);
                    lbRegexes.Items.Add(p.ToString());
                }
            }
        }
    }
}
In this example, you selected all the rows from the sample table dbo.RegexTest and then cast the Pattern column values into RegexPattern structs. Finally, you called the ToString() method of each struct, adding the text of the regex as a new item in the list box.
You can also create SqlParameter objects to be mapped to UDT columns by using code such as the following:
SqlParameter p = new SqlParameter("@Pattern", SqlDbType.Udt);
p.UdtTypeName = "RegexPattern";
p.Value = new RegexPattern(@"\d+\s+\d+"); // verbatim string so the backslashes reach Regex intact
command.Parameters.Add(p);
Finally, keep in mind that FOR XML does not implicitly serialize UDTs. You have to do that yourself, as in the following example:
SELECT Pattern.ToString() AS '@Regex'
FROM dbo.RegexTest
FOR XML PATH('Pattern'), ROOT('Patterns'), TYPE
go
<Patterns>
<Pattern Regex="\d+" />
<Pattern Regex="foo (?:bar)" />
</Patterns>